Maintainer: Jianhai Zhang

1 Introduction

Rapid advances in high-throughput biotechnology (e.g. next-generation sequencing, microarrays) have generated large amounts of biological data. Accordingly, tools have been developed to visualise these data, such as eFP (Winter et al. 2007), ePlant (Waese et al. 2017), gganatogram (Maag 2018), and brainR (Muschelli, Sweeney, and Crainiceanu 2014). These tools map gene expression data onto a pre-defined image of the tissues in which the data were measured. Their strength is that they display the data intuitively and interactively, and can therefore promote hypothesis generation. However, the visualisation relies on data and images pre-configured by the developers, so users have difficulty displaying their own custom data. To address this issue, we developed the R/Bioconductor package spatialHeatmap. For the usage of this package, refer to the vignette.

Plotting a spatial heatmap requires a formatted data matrix paired with an SVG image. For convenience, an SVG repository is provided from which users can download SVG images of interest. If the target SVG image is not available in this repository, users need to make a custom SVG image. This tutorial explains that process in detail. All files used in this tutorial are available to download, so the results can be reproduced.

2 SVG Repository

The SVG repository covers different species, and more SVG images will be added in future. The SVG templates are modified from those of the EBI Gene Expression Group to be compatible with spatialHeatmap.

All SVG images in the repository are already configured. If the SVG image of interest is available in this repository, users only need to download it (click an image, mouse over the image and right click, then select “Save image as…”) and set the tissue ids.

If no SVG template is of interest in this repository, users can follow the step-by-step SVG tutorial below to create their custom SVG images.

3 Make SVG Images

To make SVG images, a PNG image with defined tissues and the SVG editor Inkscape are required. Inkscape is used to draw the SVG image with the PNG image as a template, and to format the SVG image in accordance with the data matrix, whose values are used to colour the tissues in spatial heatmaps. If the tissue outlines in the PNG image are clear, the image editor GIMP can additionally be used to extract polygons for the SVG image automatically.

There are 3 options for making SVG images: Draw Over Template Polygons, Use Regular Shapes, and Use GIMP. If tissues in the template image have unclear outlines, one of the first 2 options has to be used, as GIMP is only applicable to tissues with clear outlines.

3.1 Draw Over Template Polygons

Download the PNG image root.png (Mustroph et al. (2009), click the image, click “Download”, right click, and select “Save image as…”) and open it in Inkscape. Select “Draw Bezier curves and straight lines (shift+F6)” on the left tool bar.

Select “Fill and Stroke…” under the “Object” tab on the top. On the right panel “Fill and Stroke (Shift+Ctrl+F)”, set the width under “Stroke style” to 3.000 px and press the “Enter” key.

Press the “+” key to zoom in and select a polygon to start with. Left click at the different corners of the polygon to draw an outline. Finally, click the first corner again to close the outline.

If the new polygon is filled with a colour, click “No paint” under the “Fill” tab on the panel “Fill and Stroke (Shift+Ctrl+F)”. Then a new sealed transparent polygon is drawn.

Select “Edit paths by nodes (F2)” on the left tool bar, and draw a rectangle over the new polygon. Select “Make selected nodes corner” on the top.

Drag nodes and edges to align the new polygon with the template polygon. On the fill and stroke panel, under the “Fill” tab, select “Flat color” and adjust the colour to label the new polygon. The first polygon is now complete.

3.2 Use Regular Shapes

If the template polygons are similar to regular shapes such as rectangles or circles, these regular shapes can be used to make new polygons.

Select “Create rectangles and squares (F4)” on the left, and draw a rectangle over a polygon template. Convert this object to path by selecting “Object to Path” under “Path” tab on the top.

Click “No paint” under the “Fill” tab on the fill and stroke panel to make the rectangle transparent. Rotate the rectangle. Select “Edit paths by nodes (F2)” on the left tool bar. If necessary, add a node by double-clicking on an edge. Drag nodes and edges to align the rectangle with the underlying polygon template.

Select “Edit paths by nodes (F2)” on the left tool bar, and draw a rectangle over the new polygon. Select “Make selected nodes corner” on the top.

Drag the handles at nodes to adjust edges for fine alignment. On the fill and stroke panel select “Flat color” under “Fill” tab to colour this new polygon. The second polygon is successfully made.

3.3 Use GIMP

If polygons in the template PNG image have clear outlines, the SVG image can be extracted with GIMP.

3.3.1 Extract SVG image

Open the PNG template in GIMP, and open the “Paths” panel. Right click and select “By Color”.

Now the polygons can be selected by colour. For example, clicking on a white polygon selects all polygons in white. Right click and select “To Path”, and all the white polygons are extracted to the “Paths” panel. Similarly, extract the yellow polygons.

Click in front of each extracted polygon to show the eye symbol. Mouse over the extracted polygons, right click, and select “Merge Visible Paths”. After merging, export the paths as an SVG image (root_gimp.svg). Next, edit the exported SVG image in Inkscape.

The exported SVG image “root_gimp.svg” is accessible here (hover over the image, right click, and select “Save image as…”).

3.3.2 Edit SVG image in Inkscape

Open the exported SVG image in Inkscape. Under “Object” tab at the top, select “Fill and Stroke…”. First click the SVG image and then adjust colours to fill it by using “Flat color” under “Fill” tab.

All paths in the SVG image generated in GIMP are combined as a whole. To separate the paths, first click the image, then select “Break Apart” under the “Path” tab on the top. Now the paths are separated (i.e. the polygons are separated), but the outlines of the polygons are not stroked. Thus use “Ctrl+A” to select all polygons, select “Flat color” under the “Stroke paint” tab on the fill and stroke panel, and set a width under the “Stroke style” tab (1.333 px).

Click the white area to unselect the whole image. Press the “+” key to zoom in, try moving the different polygons, and delete unnecessary ones by pressing the “Delete” key.

Use “Ctrl+A” to select all polygons. Click “No paint” under the “Fill” tab on the fill and stroke panel. Click the white area to unselect the whole image. The blank SVG image is now ready to be formatted according to the data matrix.

4 Format SVG Image

This step is carried out in accordance with the data matrix. The data matrix, whose rows are genes and whose columns are samples/conditions, is formatted by a targets file: a data frame of column metadata with at least 2 columns, which define the replicates of samples and conditions respectively.

4.1 Requirements on the targets file

  1. It is a data frame with at least 2 columns. Its rows correspond to the columns of the data matrix.

  2. The sample column specifies sample replicates. It is crucial that replicates of the same sample have identical names.

  3. The condition column has the same requirement as the sample column.

  4. The names of samples and conditions should consist only of letters, digits, dots, single spaces, or single underscores.
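
Requirement 4 can be checked before plotting. The following is a minimal Python sketch, not part of spatialHeatmap; the helper names `is_valid_name` and `invalid_names` are made up for illustration. It assumes that “single” space or underscore means that two separators never occur in a row, and that a name does not start or end with a separator.

```python
import re

# Letters, digits and dots, optionally joined by ONE space or ONE underscore.
# Assumption: "single" means separators never occur twice in a row and
# never start or end a name.
NAME_PATTERN = re.compile(r"[A-Za-z0-9.]+(?:[ _][A-Za-z0-9.]+)*")


def is_valid_name(name):
    """Return True if a sample or condition name follows requirement 4."""
    return NAME_PATTERN.fullmatch(name) is not None


def invalid_names(names):
    """Return the sorted, de-duplicated names that break requirement 4."""
    return sorted({name for name in names if not is_valid_name(name)})


# Hypothetical names: the first two are fine, the last two are not.
print(invalid_names(["root_pGL2", "control", "root__pCO2", "hyp-oxia"]))
```

Running such a check on both the sample and the condition column catches characters such as hyphens or doubled underscores early.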

Table 1 (Mustroph et al. 2009) is a targets file formatted according to the above requirements. It will be used to format this SVG image and is downloadable here (click the file, click “Raw”, right click, and select “Save as…”).

It is critical that replicates of the same sample or condition have identical names. As shown in Table 1, there are 2 samples, “root_pGL2” and “root_pCO2”, and each sample has identical replicate names; the same is true for the 2 conditions.

Table 1 Targets file (metadata of the data matrix columns). Promoters pGL2, pCO2, pSCR, and pWOL label root atrichoblast epidermis, root cortex meristematic zone, root endodermis, and root vasculature respectively. Only the first 2 are shown.

                         samples     conditions
root_control_pGL2_rep1   root_pGL2   control
root_control_pGL2_rep2   root_pGL2   control
root_control_pGL2_rep3   root_pGL2   control
root_hypoxia_pGL2_rep1   root_pGL2   hypoxia
root_hypoxia_pGL2_rep2   root_pGL2   hypoxia
root_control_pCO2_rep1   root_pCO2   control
root_control_pCO2_rep2   root_pCO2   control
root_hypoxia_pCO2_rep1   root_pCO2   hypoxia
root_hypoxia_pCO2_rep2   root_pCO2   hypoxia
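
To illustrate how identical names define replicates, the rows of Table 1 can be counted per sample/condition pair. This is only a Python sketch of the idea: the `TARGETS` list is typed in by hand from Table 1, and `replicate_counts` is a hypothetical helper, not a spatialHeatmap function.

```python
from collections import Counter

# Rows of Table 1 as (row name, sample, condition).
TARGETS = [
    ("root_control_pGL2_rep1", "root_pGL2", "control"),
    ("root_control_pGL2_rep2", "root_pGL2", "control"),
    ("root_control_pGL2_rep3", "root_pGL2", "control"),
    ("root_hypoxia_pGL2_rep1", "root_pGL2", "hypoxia"),
    ("root_hypoxia_pGL2_rep2", "root_pGL2", "hypoxia"),
    ("root_control_pCO2_rep1", "root_pCO2", "control"),
    ("root_control_pCO2_rep2", "root_pCO2", "control"),
    ("root_hypoxia_pCO2_rep1", "root_pCO2", "hypoxia"),
    ("root_hypoxia_pCO2_rep2", "root_pCO2", "hypoxia"),
]


def replicate_counts(rows):
    """Count rows sharing the same (sample, condition) pair."""
    return Counter((sample, condition) for _, sample, condition in rows)


counts = replicate_counts(TARGETS)
for (sample, condition), n in sorted(counts.items()):
    print(f"{sample}  {condition}  {n} replicates")
```

Rows are treated as replicates only when both names agree exactly, so a misspelt sample name such as “root_pgl2” would appear as a separate pair with its own count.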

To plot spatial heatmaps successfully, the SVG image must be formatted according to the following requirements.

4.2 Requirements on SVG format

  1. A path represents a shape. If a tissue consists of multiple paths and is expected to be coloured in the spatial heatmap, all its paths must be grouped as a whole (labeled by the tag “g”). The group id is the tissue id, and the ids of the paths inside the group are not used. A group should not include another group, which means all elements in a group should be single paths. However, if a multiple-path tissue is not expected to be coloured in the spatial heatmap, there is no need to group its paths, and the paths can keep random ids.

  2. If a tissue has only one path, it can stay as an individual path and does not need to be formatted as a group.

  3. If a tissue is expected to be coloured in the spatial heatmaps, its “id” must be identical to the corresponding tissue name in the targets file. Even a difference in a dot, space, underscore, or letter case matters.

  4. All the tissues (groups and single paths) must be placed in another container group as a whole, and this group must be the last element in the “XML Editor”.
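
The structural requirements 1, 2, and 4 can be checked outside Inkscape. The sketch below uses Python's standard `xml.etree.ElementTree` module; `check_svg` and the hand-written `EXAMPLE_SVG` are hypothetical and only illustrate the rules above, they are not how spatialHeatmap itself reads the file.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical, hand-written SVG used only for illustration.
EXAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg">
  <g id="container">
    <g id="pGL2">
      <path id="path1" d="M0 0 L1 0 L1 1 Z"/>
      <path id="path2" d="M2 0 L3 0 L3 1 Z"/>
    </g>
    <path id="pCO2" d="M4 0 L5 0 L5 1 Z"/>
  </g>
</svg>"""


def check_svg(svg_text):
    """Return (tissue_ids, problems) for an SVG given as a string."""
    root = ET.fromstring(svg_text)
    children = list(root)
    # Requirement 4: the container group must be the last element.
    if not children or children[-1].tag != SVG_NS + "g":
        return [], ["the last element of the SVG is not a group"]

    tissue_ids, problems = [], []
    for element in children[-1]:
        if element.tag == SVG_NS + "g":
            # Requirement 1: a tissue group holds single paths only.
            if any(child.tag != SVG_NS + "path" for child in element):
                problems.append(
                    f"group {element.get('id')!r} contains a non-path element"
                )
            tissue_ids.append(element.get("id"))
        elif element.tag == SVG_NS + "path":
            # Requirement 2: a single-path tissue stays as a path.
            tissue_ids.append(element.get("id"))
    return tissue_ids, problems


ids, problems = check_svg(EXAMPLE_SVG)
print(ids, problems)
```

The returned ids are the group ids and single-path ids inside the container group, which are the candidates for matching against tissue names under requirement 3.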

The following section shows the process of formatting the SVG image.

Group same tissues

Take pGL2 as an example. Select the polygons of this sample by clicking a polygon edge while pressing the “Shift” key. Mouse over any edge of the selected polygons, right click, and select “Group”.

Click “Flat color” under the “Fill” tab on the fill and stroke panel, and use a colour to fill the grouped polygons.

Set tissue id

Under the “Edit” tab on the top, select “XML Editor…”. Click the group to select it. On the “XML Editor (Shift+Ctrl+X)” panel, first click the id and type in “pGL2” then click “Set”. After that, the first group is done.

Similarly, group and set the id for pCO2, pSCR, and pWOL respectively. When grouping the small vasculature polygons in the center, a shortcut is to draw a rectangle over them to select them all, rather than clicking each one individually. Note that if a tissue sample contains only one polygon, there is no need to group it, but its id should be identical to the corresponding sample in the targets file in order to colour it in spatial heatmaps.
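
Because ids must match sample names exactly, it can help to list near misses that differ only in case or separators. The following Python sketch is an illustration under assumptions: `compare_ids` and `normalise` are hypothetical helpers, and the ids and sample names in the example call are invented rather than taken from the root image.

```python
import re


def normalise(name):
    """Drop dots, spaces, underscores and case; used only to hint at near misses."""
    return re.sub(r"[._ ]", "", name).lower()


def compare_ids(svg_ids, sample_names):
    """Return exactly matching ids, plus near misses for unmatched sample names."""
    svg_ids, sample_names = set(svg_ids), set(sample_names)
    matched = sorted(svg_ids & sample_names)
    near_misses = {
        sample: sorted(i for i in svg_ids if normalise(i) == normalise(sample))
        for sample in sorted(sample_names - svg_ids)
    }
    return matched, near_misses


# Invented ids and sample names, for illustration only.
matched, near_misses = compare_ids(
    svg_ids=["shoot_tip", "Leaf.1", "path4382"],
    sample_names=["shoot_tip", "leaf.1", "stem"],
)
print(matched)
print(near_misses)
```

Only the ids in `matched` would be coloured; an entry in `near_misses` with a non-empty list points to an id that probably needs to be retyped in the “XML Editor”.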

Brown, blue, orange, and purple label pGL2, pCO2, pSCR, and pWOL respectively. The blank polygons have random ids and will not be coloured in the spatial heatmap.

Container group

Finally, all groups and independent paths must be grouped together as a container group. To do so, use “Ctrl+A” to select all, mouse over the polygons, right click, and click “Group”. Changing the container group id is optional.

It is crucial that this container group is the last element in the “XML Editor”.

The SVG image is now done. Save the formatted SVG image as “root.svg”, which is downloadable here. Note: the SVG file name must end with “.svg”, and the part before it should consist only of letters, digits, dots, or underscores. E.g. root(1).svg is not acceptable.
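
The file-name rule can be expressed as a regular expression. The Python sketch below is one possible reading of the rule, not a check taken from spatialHeatmap; `is_valid_svg_name` is a hypothetical helper, and it assumes the extension is written in lowercase.

```python
import re

# One or more letters, digits, dots or underscores, followed by ".svg".
# Assumption: the extension must be lowercase.
SVG_NAME = re.compile(r"[A-Za-z0-9._]+\.svg")


def is_valid_svg_name(filename):
    """Return True if the file name follows the naming rule above."""
    return SVG_NAME.fullmatch(filename) is not None


for name in ["root.svg", "root_gimp.svg", "root(1).svg"]:
    print(name, is_valid_svg_name(name))
```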

References

Maag, Jesper L V. 2018. “Gganatogram: An R Package for Modular Visualisation of Anatograms and Tissues Based on Ggplot2.” F1000Res. 7 (September): 1576.

Muschelli, John, Elizabeth Sweeney, and Ciprian Crainiceanu. 2014. “BrainR: Interactive 3 and 4D Images of High Resolution Neuroimage Data.” R J. 6 (1): 41–48.

Mustroph, Angelika, M Eugenia Zanetti, Charles J H Jang, Hans E Holtan, Peter P Repetti, David W Galbraith, Thomas Girke, and Julia Bailey-Serres. 2009. “Profiling Translatomes of Discrete Cell Populations Resolves Altered Cellular Priorities During Hypoxia in Arabidopsis.” Proc Natl Acad Sci U S A 106 (44): 18843–8.

Waese, Jamie, Jim Fan, Asher Pasha, Hans Yu, Geoffrey Fucile, Ruian Shi, Matthew Cumming, et al. 2017. “EPlant: Visualizing and Exploring Multiple Levels of Data for Hypothesis Generation in Plant Biology.” Plant Cell 29 (8): 1806–21.

Winter, Debbie, Ben Vinegar, Hardeep Nahal, Ron Ammar, Greg V Wilson, and Nicholas J Provart. 2007. “An ‘Electronic Fluorescent Pictograph’ Browser for Exploring and Analyzing Large-Scale Biological Data Sets.” PLoS One 2 (8): e718.